Smart Paper Notes

Multi-Attribute Open Set Recognition

Piyapat Saranrittichai , Chaithanya Kumar Mummadi , Claudia Blaiotta , Mauricio Munoz , Volker Fischer

Categories: Computer Vision

2022-08-14

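Open set recognition (OSR) extends image classification to an open-world setting by simultaneously classifying known classes and identifying unknown ones. Although conventional OSR methods can detect out-of-distribution (OOD) samples, they cannot explain which underlying visual attributes (e.g., shape, color, or background) cause a specific sample to be unknown. In this work, we introduce a new problem setting that generalizes conventional OSR to a multi-attribute setting, in which multiple visual attributes are recognized simultaneously. Here, OOD samples can not only be identified but also categorized by their unknown attributes. We propose simple extensions of common OSR baselines to handle this novel scenario. We show that these baselines are vulnerable to shortcuts when spurious correlations exist in the training dataset. This leads to poor OOD performance, which, according to our experiments, is mainly due to unintended cross-attribute correlations of the predicted confidence scores. We provide empirical evidence that this behavior is consistent across different baselines on both synthetic and real-world datasets.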
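As a rough illustration of the multi-attribute setting described above, the sketch below scores each attribute separately and flags an attribute as unknown when its confidence is low. It assumes maximum softmax probability with a fixed threshold, which is one common OSR baseline; the abstract does not say which baselines the authors extend, and the function names, the `threshold` value, and the example logits are all hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def recognize(attribute_logits, threshold=0.7):
    """Predict one label per attribute, or "unknown" if confidence is low.

    attribute_logits maps an attribute name (e.g. "shape") to the logits
    of a hypothetical classifier head for that attribute.
    """
    result = {}
    for name, logits in attribute_logits.items():
        probs = softmax(logits)
        confidence = max(probs)  # maximum softmax probability (assumed score)
        label = probs.index(confidence)
        result[name] = label if confidence >= threshold else "unknown"
    return result

# Made-up logits: a confident "shape" head and an uncertain "color" head.
sample = {"shape": [4.0, 0.5, 0.2], "color": [1.0, 1.1, 0.9]}
prediction = recognize(sample)
unknown_attributes = [n for n, v in prediction.items() if v == "unknown"]
```

Because each attribute gets its own confidence score, a sample can be reported as unknown in color while its shape is still classified. The abstract's finding is that, with spurious correlations in the training data, such per-attribute scores can become correlated across attributes.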


Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain

Piyapat Saranrittichai , Chaithanya Kumar Mummadi , Claudia Blaiotta , Mauricio Munoz , Volker Fischer

Categories: Computer Vision | Artificial Intelligence

2022-07-20

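Shortcut learning occurs when a deep neural network relies too heavily on spurious correlations in the training dataset to solve downstream tasks. Prior work has shown how this impairs the compositional generalization capability of deep learning models. To address this problem, we propose a novel approach to mitigate shortcut learning in uncontrolled target domains. Our approach extends the training set with an additional dataset (the source domain), which is specifically designed to facilitate learning independent representations of basic visual factors. We benchmark our idea on synthetic target domains, where we explicitly control shortcut opportunities, as well as on real-world target domains. Furthermore, we analyze the effect of different specifications of the source domain and of the network architecture on compositional generalization. Our main finding is that leveraging data from a source domain is an effective way to mitigate shortcut learning. By promoting independence across different factors in the learned representations, networks can learn to consider only predictive factors and ignore potential shortcut factors during inference.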


Constructing Organism Networks from Collaborative Self-Replicators

Steffen Illium , Maximilian Zorn , Cristian Lenta , Michael Kölle , Claudia Linnhoff-Popien , Thomas Gabor

Categories: Neural and Evolutionary Computing | Machine Learning

2022-12-20

We introduce organism networks, which function like a single neural network but are composed of several neural particle networks; while each particle network fulfils the role of a single weight application within the organism network, it is also trained to self-replicate its own weights. As organism networks feature vastly more parameters than simpler architectures, we perform our initial experiments on an arithmetic task as well as on simplified MNIST classification as a collective. We observe that individual particle networks tend to specialise in one of the two tasks, and that those fully specialised in the secondary task may be dropped from the network without hindering the computational accuracy of the primary task. This leads to the discovery of a novel pruning strategy for sparse neural networks.


Empirical Analysis of Limits for Memory Distance in Recurrent Neural Networks

Steffen Illium , Thore Schillman , Robert Müller , Thomas Gabor , Claudia Linnhoff-Popien

Categories: Machine Learning | Computer Vision

2022-12-20

Common to all kinds of recurrent neural networks (RNNs) is the intention to model relations between data points through time. When there is no immediate relationship between subsequent data points (e.g., when the data points are generated at random), we show that RNNs are still able to remember a few data points back into the sequence by memorizing them by heart using standard backpropagation. However, we also show that for classical RNNs, LSTM and GRU networks, the distance between data points across recurrent calls that can be reproduced this way is highly limited (compared to even a loose connection between data points) and subject to various constraints imposed by the type and size of the RNN in question. This implies the existence of a hard limit (far below the information-theoretic one) on the distance between related data points within which RNNs are still able to recognize said relation.
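A task of the kind described above can be sketched as a delayed-recall problem: the inputs are random symbols, and the target at each step is the symbol seen a fixed number of steps earlier. This is only an assumed setup for illustration. The abstract does not give the authors' exact protocol, and `make_recall_task`, `distance`, and `vocab_size` are hypothetical names.

```python
import random

def make_recall_task(length, distance, vocab_size=10, seed=0):
    """Build a random sequence and its delayed-recall targets.

    The target for input position t (t >= distance) is the symbol at
    position t - distance, so a model must hold `distance` past symbols
    in memory. The inputs are random, so there is no other structure
    the model could exploit.
    """
    if not 0 < distance < length:
        raise ValueError("distance must satisfy 0 < distance < length")
    rng = random.Random(seed)
    inputs = [rng.randrange(vocab_size) for _ in range(length)]
    targets = [inputs[t - distance] for t in range(distance, length)]
    return inputs, targets

inputs, targets = make_recall_task(length=20, distance=3)
```

Sweeping `distance` upward while training a fixed-size RNN on such data would be one way to probe where recall breaks down.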


VoronoiPatches: Evaluating A New Data Augmentation Method

Steffen Illium , Gretchen Griffin , Michael Kölle , Maximilian Zorn , Jonas Nüßlein , Claudia Linnhoff-Popien

Categories: Computer Vision | Machine Learning

2022-12-20

Overfitting is a problem in Convolutional Neural Networks (CNNs) that causes poor generalization of models on unseen data. To remedy this problem, many new and diverse data augmentation (DA) methods have been proposed to supplement or generate more training data, and thereby increase its quality. In this work, we propose a new data augmentation algorithm: VoronoiPatches (VP). We primarily utilize non-linear recombination of information within an image, fragmenting and occluding small information patches. Unlike other DA methods, VP uses small convex polygon-shaped patches in a random layout to transport information around within an image. Sudden transitions created between patches and the original image can, optionally, be smoothed. In our experiments, VP outperformed current DA methods regarding model variance and overfitting tendencies. We demonstrate that data augmentation utilizing non-linear recombination of information within images, together with non-orthogonal shapes and structures, improves CNN model robustness on unseen data.


AWT -- Clustering Meteorological Time Series Using an Aggregated Wavelet Tree

Christina Pacher , Irene Schicker , Rosmarie deWit , Katerina Hlavackova-Schindler , Claudia Plant

Categories: Machine Learning

2022-12-13

Both clustering and outlier detection play an important role for meteorological measurements. We present the AWT algorithm, a clustering algorithm for time series data that also performs implicit outlier detection during the clustering. AWT integrates ideas of several well-known K-Means clustering algorithms. It chooses the number of clusters automatically based on a user-defined threshold parameter, and it can be used for heterogeneous meteorological input data as well as for data sets that exceed the available memory size. We apply AWT to crowd-sourced 2-m temperature data with an hourly resolution from the city of Vienna to detect outliers and to investigate whether the final clusters show general similarities and similarities with urban land-use characteristics. It is shown that both the outlier detection and the implicit mapping to land-use characteristics are possible with AWT, which opens new possible fields of application, specifically in the rapidly evolving field of urban climate and urban weather.


Multi-view Graph Convolutional Networks with Differentiable Node Selection

Zhaoliang Chen , Lele Fu , Shunxin Xiao , Shiping Wang , Claudia Plant , Wenzhong Guo

Categories: Machine Learning

2022-12-09

Multi-view data containing complementary and consensus information can facilitate representation learning by exploiting the intact integration of multi-view features. Because most objects in the real world have underlying connections, organizing multi-view data as heterogeneous graphs is beneficial to extracting latent information among different objects. Owing to its powerful capability to gather information from neighborhood nodes, we apply a Graph Convolutional Network (GCN) in this paper to cope with heterogeneous-graph data originating from multi-view data, which is still under-explored in the field of GCN. In order to improve the quality of network topology and alleviate the interference of noise yielded by graph fusion, some methods undertake sorting operations before the graph convolution procedure. These GCN-based methods generally sort and select the most confident neighborhood nodes for each vertex, such as picking the top-k nodes according to pre-defined confidence values. Nonetheless, this is problematic due to the non-differentiable sorting operators and inflexible graph embedding learning, which may result in blocked gradient computations and undesired performance. To cope with these issues, we propose a joint framework dubbed Multi-view Graph Convolutional Network with Differentiable Node Selection (MGCN-DNS), which consists of an adaptive graph fusion layer, a graph learning module and a differentiable node selection schema. MGCN-DNS accepts multi-channel graph-structural data as inputs and aims to learn more robust graph fusion through a differentiable neural network. The effectiveness of the proposed method is verified by rigorous comparisons with a considerable number of state-of-the-art approaches in terms of multi-view semi-supervised classification tasks.
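To make the contrast between hard top-k selection and a differentiable alternative concrete, the sketch below compares a 0/1 top-k mask with a smooth sigmoid weighting around a threshold. This is a generic relaxation chosen for illustration and is not the node selection schema of MGCN-DNS, which the abstract does not specify. The function names, the `temperature` value, and the confidence values are assumptions.

```python
import math

def hard_top_k_mask(confidences, k):
    """Return a 0/1 mask keeping the k most confident neighbors.

    The mask is piecewise constant in the confidences, so it passes no
    useful gradient back to them.
    """
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    keep = set(order[:k])
    return [1.0 if i in keep else 0.0 for i in range(len(confidences))]

def soft_top_k_weights(confidences, k, temperature=0.05):
    """Smooth stand-in for the mask (requires 0 < k < len(confidences)).

    Each confidence is passed through a sigmoid centered halfway between
    the k-th and (k+1)-th largest values, giving weights in (0, 1) that
    vary smoothly with each confidence.
    """
    ranked = sorted(confidences, reverse=True)
    threshold = (ranked[k - 1] + ranked[k]) / 2.0
    return [1.0 / (1.0 + math.exp(-(c - threshold) / temperature))
            for c in confidences]

confidences = [0.1, 0.9, 0.3, 0.7]  # made-up neighbor confidences
hard = hard_top_k_mask(confidences, k=2)
soft = soft_top_k_weights(confidences, k=2)
```

Lowering `temperature` pushes the soft weights toward the hard mask, while larger values keep the weighting smoother at the cost of a blurrier selection.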


MedalCare-XL: 16,900 healthy and pathological 12 lead ECGs obtained through electrophysiological simulations

Karli Gillette , Matthias A. F. Gsell , Claudia Nagel , Jule Bender , Bejamin Winkler , Steven E. Williams , Markus Bär , Tobias Schäffter , Olaf Dössel , Gernot Plank

Categories: Machine Learning

2022-11-29

Mechanistic cardiac electrophysiology models allow for personalized simulations of the electrical activity in the heart and the ensuing electrocardiogram (ECG) on the body surface. As such, synthetic signals possess known ground truth labels of the underlying disease and can be employed for validation of machine learning ECG analysis tools in addition to clinical signals. Recently, synthetic ECGs were used to enrich sparse clinical data, or even to replace them completely during training, leading to improved performance on real-world clinical test data. We thus generated a novel synthetic database comprising a total of 16,900 12-lead ECGs based on electrophysiological simulations, equally distributed across a healthy control class and 7 pathology classes. The pathological case of myocardial infarction had 6 sub-classes. A comparison of extracted features between the virtual cohort and a publicly available clinical ECG database demonstrated that the synthetic signals represent clinical ECGs for healthy and pathological subpopulations with high fidelity. The ECG database is split into training, validation, and test folds for development and objective assessment of novel machine learning algorithms.


A New Graph Node Classification Benchmark: Learning Structure from Histology Cell Graphs

Claudia Vanea , Jonathan Campbell , Omri Dodi , Liis Salumäe , Karen Meir , Drorith Hochner-Celnikier , Hagit Hochner , Triin Laisk , Linda M. Ernst , Cecilia M. Lindgren

Categories: Machine Learning | Computer Vision

2022-11-11

We introduce a new benchmark dataset, Placenta, for node classification in an underexplored domain: predicting microanatomical tissue structures from cell graphs in placenta histology whole slide images. This problem is uniquely challenging for graph learning for a few reasons. Cell graphs are large (>1 million nodes per image), node features are varied (64 dimensions of 11 types of cells), class labels are imbalanced (9 classes ranging from 0.21% to 40.0% of the data), and cellular communities cluster into heterogeneously distributed tissues of widely varying sizes (from 11 nodes to 44,671 nodes for a single structure). Here, we release a dataset consisting of two cell graphs from two placenta histology images totalling 2,395,747 nodes, 799,745 of which have ground truth labels. We present inductive benchmark results for 7 scalable models and show how the unique qualities of cell graphs can help drive the development of novel graph neural network architectures.


ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications

Juan Zuluaga-Gomez , Karel Veselý , Igor Szöke , Petr Motlicek , Martin Kocour , Mickael Rigault , Khalid Choukri , Amrutha Prasad , Seyyed Saeed Sarfjoo , Iuliia Nigmatulina

Categories: Natural Language Processing | Artificial Intelligence

2022-11-08

Personal assistants, automatic speech recognizers and dialogue understanding systems are becoming more critical in our interconnected digital world. A clear example is air traffic control (ATC) communications. ATC aims at guiding aircraft and controlling the airspace in a safe and optimal manner. These voice-based dialogues are carried out between an air traffic controller (ATCO) and pilots via very-high-frequency radio channels. In order to incorporate these novel technologies into ATC (a low-resource domain), large-scale annotated datasets are required to develop the data-driven AI systems. Two examples are automatic speech recognition (ASR) and natural language understanding (NLU). In this paper, we introduce the ATCO2 corpus, a dataset that aims at fostering research in the challenging ATC field, which has lagged behind due to a lack of annotated data. The ATCO2 corpus covers 1) data collection and pre-processing, 2) pseudo-annotations of speech data, and 3) extraction of ATC-related named entities. The ATCO2 corpus is split into three subsets. 1) The ATCO2-test-set corpus contains 4 hours of ATC speech with manual transcripts and a subset with gold annotations for named-entity recognition (callsign, command, value). 2) The ATCO2-PL-set corpus consists of 5281 hours of unlabeled ATC data enriched with automatic transcripts from an in-domain speech recognizer, contextual information, speaker turn information, a signal-to-noise ratio estimate and an English language detection score per sample. Both are available for purchase through ELDA at http://catalog.elra.info/en-us/repository/browse/ELRA-S0484. 3) The ATCO2-test-set-1h corpus is a one-hour subset of the original test set corpus, which we are offering for free at https://www.atco2.org/data. We expect the ATCO2 corpus will foster research on robust ASR and NLU not only in the field of ATC communications but also in the general research community.
